Hypothesis testing and ROPEs
Frequentist / Maximum likelihood: find the parameter values that maximize the likelihood Pr[Data | θ].
Bayesian: estimate the posterior distribution of the parameters given the data, via Bayes' rule:
\[Pr[\theta | Data] = \frac{Pr[Data | \theta] \times Pr[\theta]}{Pr[Data]}\]
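As a concrete illustration of Bayes' rule (a hypothetical coin-flip example, not the mole-rat data), the posterior can be computed on a grid: the likelihood times the prior, normalized so it sums to one.

```r
# Grid approximation of Bayes' rule for a coin's heads probability theta.
# Hypothetical data: 7 heads in 10 flips; flat prior.
theta <- seq(0, 1, length.out = 1000)
prior <- rep(1, length(theta))                    # Pr[theta]: flat prior
likelihood <- dbinom(7, size = 10, prob = theta)  # Pr[Data | theta]
posterior <- likelihood * prior
posterior <- posterior / sum(posterior)           # normalize (divide by Pr[Data])
theta[which.max(posterior)]                       # posterior mode, near 0.7
```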
The - 1 in the formula drops the overall intercept, so each Caste gets its own mean. The intercept is probably not interesting in this example: what would Mass = 0 mean for a naked mole rat?

Family: gaussian
Links: mu = identity; sigma = identity
Formula: Energy ~ Mass + Caste - 1
Data: M (Number of observations: 35)
Draws: 4 chains, each with iter = 40000; warmup = 20000; thin = 1;
total post-warmup draws = 80000
Population-Level Effects:
Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
Mass 0.89 0.18 0.60 1.18 1.00 15525 19642
CasteNonWorker -0.09 0.89 -1.51 1.33 1.00 15527 19894
CasteWorker 0.30 0.79 -0.96 1.56 1.00 15509 19288
Family Specific Parameters:
Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
sigma 0.31 0.04 0.25 0.38 1.00 24236 26368
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
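A brm() call consistent with the summary above would look like the following sketch. The priors are assumptions here, taken from the Stan code shown later in these notes (normal(0, 3) on the coefficients and a half student-t(3, 0, 2.5) on sigma).

```r
library(brms)

# Sketch of a model call matching the printed summary:
# Energy ~ Mass + Caste - 1, gaussian family, 4 chains,
# 40000 iterations with 20000 warmup (priors inferred from the Stan code)
fm <- brm(
  Energy ~ Mass + Caste - 1,
  data = M,
  family = gaussian(),
  prior = c(prior(normal(0, 3), class = b),
            prior(student_t(3, 0, 2.5), class = sigma)),
  chains = 4, iter = 40000, warmup = 20000
)
summary(fm, prob = 0.89)  # 89% credible intervals, as printed above
```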
library(tidybayes)

post <- fm |>
  spread_rvars(b_Mass, b_CasteWorker, b_CasteNonWorker) |>
  mutate(d_W_NW = b_CasteWorker - b_CasteNonWorker, .keep = "unused")
head(post)
# A tibble: 1 × 2
b_Mass d_W_NW
<rvar[1d]> <rvar[1d]>
1 0.89 ± 0.18 0.39 ± 0.15
# A tibble: 1 × 9
b_Mass b_Mass.lower b_Mass.upper d_W_NW d_W_NW…¹ d_W_N…² .width .point .inte…³
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 0.890 0.604 1.18 0.391 0.161 0.626 0.89 median hdi
# … with abbreviated variable names ¹d_W_NW.lower, ²d_W_NW.upper, ³.interval
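The .point = median and .interval = hdi columns suggest this table was produced with tidybayes' median_hdi() at .width = 0.89 (an assumption; post is the rvar tibble built above).

```r
# Median and 89% highest-density interval for each rvar column in post
post |>
  median_hdi(.width = 0.89)
```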
Many interesting visualizations are available in the see package.
hypothesis() is similar to generalized linear hypothesis tests (glht() in the multcomp package):
Hypothesis Tests for class b:
Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
1 (CasteWorker-Cast... = 0 0.39 0.15 0.37 0.41 NA
Post.Prob Star
1 NA *
---
'CI': 78%-CI for one-sided and 89%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 89%;
for two-sided hypotheses, the value tested against lies outside the 89%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
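The test above can presumably be reproduced with a call like this (fm is the fitted model from earlier; coefficient names in the hypothesis string drop the b_ prefix):

```r
# Two-sided test of no difference between Worker and NonWorker castes
hypothesis(fm, "CasteWorker - CasteNonWorker = 0")
```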
The generated quantities block lets you access the samples at each iteration (rethinking's ulam() exposes this via a gq> statement in the model). You can also extract the Stan code from a fitted model and use rstan or cmdstanr directly:
// generated with brms 2.18.0
functions {
}
data {
  int<lower=1> N;   // total number of observations
  vector[N] Y;      // response variable
  int<lower=1> K;   // number of population-level effects
  matrix[N, K] X;   // population-level design matrix
  int prior_only;   // should the likelihood be ignored?
}
transformed data {
}
parameters {
  vector[K] b;          // population-level effects
  real<lower=0> sigma;  // dispersion parameter
}
transformed parameters {
  real lprior = 0;  // prior contributions to the log posterior
  lprior += normal_lpdf(b | 0, 3);
  lprior += student_t_lpdf(sigma | 3, 0, 2.5)
    - 1 * student_t_lccdf(0 | 3, 0, 2.5);
}
model {
  // likelihood including constants
  if (!prior_only) {
    target += normal_id_glm_lpdf(Y | X, 0, b, sigma);
  }
  // priors including constants
  target += lprior;
}
generated quantities {
}
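From R, the generated Stan code can be extracted with brms::stancode() and then compiled or modified with rstan or cmdstanr (a sketch; fm is the fitted model from earlier):

```r
# Print the Stan program brms generated for the fitted model
stancode(fm)

# Write it to a file and compile it yourself with cmdstanr,
# e.g. to add your own generated quantities block
mod <- cmdstanr::cmdstan_model(
  cmdstanr::write_stan_file(stancode(fm))
)
```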
Bayes factors, comparing the posterior against a point null (null = 0), can also be used for hypothesis tests. But:
“While Bayes factors provide an immediate approach to hypothesis testing, they are highly sensitive to details of the data/model assumptions.” (Schad et al. 2022)
“A difference of zero plus or minus some small amount is for practical purposes ‘not different’ from zero.”
Easystats has a good introduction.
Additional reading:
rope() computes the proportion of the posterior inside the ROPE; rope_range() will automatically find the most suitable range for comparison.
Setting ci = 1 makes rope() use the whole posterior:

# Proportion of samples inside the ROPE [-0.25, 0.25]:
Parameter | inside ROPE
-----------------------
Mass | 0.03 %
d_W_NW | 16.16 %
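Instead of choosing the range by hand, rope_range() can suggest one (a sketch using bayestestR; fm is the fitted model from earlier):

```r
library(bayestestR)

# Default ROPE for a linear model: approximately +/- 0.1 * sd(response)
rope_range(fm)

# Proportion of the full posterior (ci = 1) inside that default range
rope(fm, range = rope_range(fm), ci = 1)
```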